I have to solve this at work.
- There are N individuals. N is unknown (thousands).
- There are M groups. M is known (hundreds).
- Every individual belongs to at least 1 group and at most K groups. K is 
known (tens).
- The exact number of individuals L for each group is known. It ranges from 
1 to several hundreds.
- The proportion P of individuals belonging to a given number of groups is 
known. For instance, we know that 50% of the individuals belong to only one 
group, 20% belong to 2 groups and so on.
- We have sampled the population for 12 of the larger groups and found the 
ratio [actual number of distinct individuals belonging to the 12 groups]/[sum 
of the group sizes (L) for the 12 groups] to be 80%.
With all the parameters above, is it possible to estimate N?
Of course, the maximum estimate is sum(L). Empirically, I'd say that a 
better estimate is 0.8 x sum(L), but I have the (also empirical) feeling 
that the actual number is much, much lower.
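
One possible way to frame this is double counting: an individual belonging to k groups is counted k times in sum(L), so sum(L) = N x (mean number of groups per individual), where the mean is the sum over k of k x P(k). This is only a sketch, and it assumes that P is exact and covers every k from 1 to K. The function name estimate_n and the numbers below are made up for illustration; they are not real data.

```python
def estimate_n(group_sizes, proportions):
    """Estimate N by double counting group memberships.

    group_sizes: list of the known group sizes L, one entry per group.
    proportions: dict mapping k (number of groups) to the fraction P(k)
                 of individuals belonging to exactly k groups.
    Assumption: the fractions are exact and sum to 1.
    """
    total = sum(proportions.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("proportions must sum to 1")
    # Mean number of groups per individual: sum over k of k * P(k).
    mean_memberships = sum(k * p for k, p in proportions.items())
    # Each individual contributes k memberships to sum(L), so
    # sum(L) = N * mean_memberships.
    return sum(group_sizes) / mean_memberships


# Made-up illustration: three groups, half of the individuals in one
# group, a quarter in two groups, a quarter in four groups.
sizes = [300, 200, 100]
p = {1: 0.5, 2: 0.25, 4: 0.25}
print(estimate_n(sizes, p))
```

Under this reading, the 80% figure only reflects overlaps among the 12 sampled groups and ignores overlaps with the remaining groups, which would be consistent with the feeling that N is well below 0.8 x sum(L).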
(The real-life problem deals with bibliometrics: individuals are scientific 
articles and groups are keywords.)
G.